Microbe–disease associations prediction by graph regularized non‐negative matrix factorization with L2,1 norm regularization terms

Abstract Microbes are involved in a wide range of biological processes and are closely associated with disease. Inferring potential disease‐associated microbes as the biomarkers or drug targets may help prevent, diagnose and treat complex human diseases. However, biological experiments are time‐consuming and expensive. In this study, we introduced a new method called iPALM‐GLMF, which modelled microbe–disease association prediction as a problem of non‐negative matrix factorization with graph dual regularization terms and L2,1 norm regularization terms. The graph dual regularization terms were used to capture potential features in the microbe and disease space, and the L2,1 norm regularization terms were used to ensure the sparsity of the feature matrices obtained from the non‐negative matrix factorization and to improve the interpretability. To solve the model, iPALM‐GLMF used a non‐negative double singular value decomposition to initialize the matrix factorization and adopted an inertial Proximal Alternating Linear Minimization iterative process to obtain the final matrix factorization results. As a result, iPALM‐GLMF performed better than other existing methods in leave‐one‐out cross‐validation and fivefold cross‐validation. In addition, case studies of different diseases demonstrated that iPALM‐GLMF could effectively predict potential microbial‐disease associations. iPALM‐GLMF is publicly available at https://github.com/LiangzheZhang/iPALM‐GLMF.

Research indicates that microbes play significant roles in human health and disease, such as maintaining internal equilibrium, 4 developing the immune system 5 and resisting pathogens. 2 For instance, studies have indicated that the proliferation of pathogenic bacteria in the oral cavity may lead to an inflammatory disease known as periodontitis. 6 The findings demonstrate that periodontitis-associated microbial communities show highly conserved changes in metabolic and virulence gene expression profiles, whereas communities from healthy samples do not. This suggests that changes in the composition of the oral microbial community may be related to the pathogenesis of periodontitis. 7 Furthermore, there is clinical and histological evidence that topical application of lactic acid can be effective in depigmenting the skin, improving the surface roughness of the skin and reducing mild wrinkles caused by environmental photodamage. 8 [11][12][13] Unfortunately, traditional experimental methods are both laborious and time-consuming, and it is challenging to fully explore potential microbe-disease associations (MDAs) within a limited timeframe. Consequently, there has been a growing interest in computational models that can predict disease-associated microbes. 14
In recent years, numerous computational models, including those based on scoring functions, have been developed to predict potential microbe-disease associations. For instance, Chen et al. 15 proposed the first computational model in this field, KATZHMDA, which is based on the KATZ method. In this model, the score of a potential association is obtained by integrating the number of walks between two nodes in the network with the lengths of those walks, which is a valid metric for the probability of a potential association between a microbe and a disease. Huang et al. 16 presented the computational model of Path-Based Human Microbe-Disease Associations prediction (PBHMDA), which is based on a depth-first search algorithm for predicting microbes that may be associated with diseases. The model generates a prediction score for each microbe-disease pair by constructing a heterogeneous network and traversing all connection paths between nodes in that network using a specialized depth-first search algorithm. Long and Luo 17 proposed a novel computational model for Weighted Meta-Graph-based Human Microbe-Disease Associations prediction (WMGHMDA). The model iteratively applies a pre-designed weighted meta-graph search algorithm to a heterogeneous information network and discovers possible microbe-disease pairs by accumulating the contribution of each weighted meta-graph to a microbe-disease pair as a probability score. Xu et al. 18 developed a novel computational method to discover potential Microbe-Disease Associations based on Kronecker Regularized Least Squares (MDAKRLS). The model applies Kronecker regularized least squares with different Kronecker similarities to obtain separate prediction scores, and the final prediction scores are calculated by integrating the contributions of the different similarities. The advantages of these models are that the underlying algorithms and computational processes are relatively easy to understand and that the models do not require negative samples for prediction; the disadvantage is that most models based on scoring functions are not applicable to new diseases.
Researchers have also developed network models for microbe-disease association prediction. For example, Bao et al. 19 proposed the model of Network Consistency Projection for Human Microbe-Disease Associations prediction (NCPHMDA), which uses the similarity of nodes in a heterogeneous network to measure the correlation between microbes and diseases and calculates a consistency projection score to infer latent microbes for diseases.
Huang et al. 20 22 proposed a correlation prediction method (BRWMDA) based on similarity and an improved bi-random walk on the disease and microbe networks. The method combines network integration with double random walks on the disease and microbe networks. When the maximum number of iterations on both networks is reached, the random walk stops and produces the final correlation probability matrix. The main advantage of these models is that they can fully utilize the topological information in the network. In addition, these models involve fewer parameters, which greatly reduces the difficulty of parameter selection. The disadvantage is that some network-based approaches rely heavily on experimentally validated microbe-disease associations and cannot make predictions for new diseases or microbes in the absence of known association information.
[25][26][27][28] Concurrently, many computational methods have been developed to help identify the relationship between microbes and diseases. For example, Peng et al. 29  Convolutional Network for Microbe-Disease Associations prediction (MVGCNMDA), which employs specific data augmentation and multi-view attentional blocks to reveal microbes associated with diseases. Chen et al. 32 proposed an approach to predict MDAs based on heterogeneous networks and metapath aggregation graph neural networks (MATHNMDA). The model feeds heterogeneous networks into the metapath aggregation graph neural network and employs aggregation and attention mechanisms among metapaths to integrate the semantic information of all the different metapaths, thereby obtaining the final embeddings of microbial nodes and disease nodes.
The discovery of potential microbe-disease associations will be of great help in understanding disease pathogenesis and developing treatments for human diseases. Since traditional biological experiments are generally time-consuming and labour-intensive, efficient and reliable computational prediction methods are urgently needed. In recent years, great progress has been made in developing computational models for predicting potential microbe-disease associations. As a machine learning method, matrix factorization has proven to be an effective tool and has been widely used in bioinformatics research. For example, Gönen 33 proposed a kernelized Bayesian matrix factorization with twin kernels method to predict drug-target interactions. He et al. 34 developed a novel predictive model of Graph Regularized Non-negative Matrix Factorization for Human Microbe-Disease Association prediction (GRNMFHMDA).
In this study, we put forward a novel computational model named iPALM-GLMF, a non-negative matrix factorization model based on graph dual regularization terms and L2,1 norm regularization terms. The graph dual regularization terms were used to integrate the geometric information of the microbe similarity matrix and the disease similarity matrix, and the L2,1 norm regularization terms were used to ensure the sparsity of the matrices obtained from the non-negative matrix factorization. We then used the non-negative double singular value decomposition (NNDSVD) 35 to provide valid and interpretable initial component matrices for the matrix factorization and used an inertial proximal alternating linearized minimization iterative process, which has been shown to converge to a KKT point, to obtain the final matrix factorization. 36 Overall, our main contributions are summarized as follows:
• We introduced a novel approach that improves non-negative matrix factorization by adding graph dual regularization terms and L2,1 norm regularization terms.
• Drawing on manifold theory, we introduced graph dual regularization terms to efficiently integrate different similarity matrices and preserve the manifold features of the data space.
• To improve interpretability and mitigate the effects of inherent noise in the microbe and disease feature spaces, L2,1 norm regularization terms were applied to the feature matrices to select the most representative or discriminative sparse features.
• NNDSVD was used to initialize the non-negative matrix factorization, which was then solved with a fast-converging inertial proximal alternating linearized minimization algorithm.
Experimental results on two datasets, HMDAD and Disbiome, indicate that the iPALM-GLMF model consistently outperforms five other state-of-the-art methods. Case studies of three common diseases, colorectal cancer, inflammatory bowel disease (IBD) and asthma, further validate the effectiveness of iPALM-GLMF.

| Human microbe-disease associations
To establish the human microbe-disease interaction network, we retrieved known microbe-disease associations from the Human Microbe-Disease Association Database (HMDAD) (http://www.cuilab.cn/hmdad). 37 There were 483 experimentally confirmed microbe-disease associations between 39 diseases and 292 microbes. After removing redundant associations, we obtained 450 associations. In addition, Janssens et al. released a new microbe-disease association database called Disbiome (https://disbiome.ugent.be/home), in which 5573 experimentally confirmed human microbe-disease associations, covering 240 diseases and 1098 microbes, were collected from previously published literature and different databases. 38 In Disbiome, a microbe-disease pair may be recorded multiple times depending on the assay. After filtering out duplicates, we obtained 4351 associations between 218 diseases and 1052 microbes. The statistics of the two microbe-disease association datasets are shown in Table 1.
For ease of description, we formulated the microbe-disease associations as a binary matrix $A \in \mathbb{R}^{m \times d}$, with m and d representing the numbers of microbes and diseases, respectively. If there exists an experimentally verified relationship between a microbe $m_i$ and a disease $d_j$, $A_{ij}$ equals 1, otherwise 0.
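The construction of the binary matrix A can be illustrated with a minimal sketch. The function name `build_association_matrix` and the toy records below are illustrative assumptions and are not taken from HMDAD, Disbiome or the authors' code; the point is only that duplicate records collapse to a single entry of 1.

```python
import numpy as np

def build_association_matrix(pairs, microbes, diseases):
    """Return a binary matrix A (m x d): A[i, j] = 1 if microbe i is linked to disease j."""
    m_index = {name: i for i, name in enumerate(microbes)}
    d_index = {name: j for j, name in enumerate(diseases)}
    A = np.zeros((len(microbes), len(diseases)), dtype=float)
    for microbe, disease in pairs:
        # Repeated records of the same pair collapse to a single 1.
        A[m_index[microbe], d_index[disease]] = 1.0
    return A

# Toy, made-up records (hypothetical names, one duplicated pair).
pairs = [("m1", "d1"), ("m2", "d1"), ("m2", "d2"), ("m2", "d2")]
A = build_association_matrix(pairs, ["m1", "m2", "m3"], ["d1", "d2"])
print(A)
```

Rows index microbes and columns index diseases, matching the m × d convention used in the text.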

| Gaussian interaction profile kernel similarity for diseases and microbes
We calculated Gaussian kernel similarity for microbes and diseases based on the hypothesis that microbes related to common diseases are more likely to show similar functions. 39 Specifically, since the ith row of the adjacency matrix A records the interactions between microbe $m_i$ and all diseases, and the jth column records the interactions between disease $d_j$ and all microbes, we denote them by $IP(m_i)$ and $IP(d_j)$, the interaction profiles of microbe $m_i$ and disease $d_j$, respectively. The Gaussian kernel similarities between microbes and between diseases are defined as follows:
$$GM(m_i, m_j) = \exp\left(-\gamma_m \lVert IP(m_i) - IP(m_j) \rVert^{2}\right)$$
$$GD(d_i, d_j) = \exp\left(-\gamma_d \lVert IP(d_i) - IP(d_j) \rVert^{2}\right)$$
where $\gamma_m$ and $\gamma_d$ are the kernel bandwidths.
TABLE 1 Information on the two microbe-disease association datasets.
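As an illustration of the Gaussian interaction profile kernel, the sketch below computes the similarity between all rows of a profile matrix. It assumes the usual bandwidth normalization, in which the original bandwidth (set to 1) is divided by the mean squared norm of the interaction profiles; the function name `gip_kernel` and the toy matrix are hypothetical and are not part of the published implementation.

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    # Assumed normalization: original bandwidth divided by the mean squared profile norm.
    gamma = gamma_prime / mean_sq if mean_sq > 0 else gamma_prime
    # Pairwise squared Euclidean distances between profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dist)

# Toy adjacency matrix: 3 microbes (rows) x 2 diseases (columns).
A_toy = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [0.0, 1.0]])
GM = gip_kernel(A_toy)    # microbe similarity, 3 x 3 (rows of A)
GD = gip_kernel(A_toy.T)  # disease similarity, 2 x 2 (columns of A)
```

Microbe profiles are the rows of A and disease profiles are its columns, so the same function serves both by passing A or its transpose.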

| METHODS
In this paper, we present a novel model, iPALM-GLMF, which formulates microbe-disease association prediction as a non-negative matrix factorization problem with graph dual regularization terms and L2,1 norm regularization terms. iPALM-GLMF takes the microbe-disease association matrix A, the Gaussian interaction profile kernel similarity of microbes GM and the Gaussian interaction profile kernel similarity of diseases GD as inputs. It uses GM and GD to construct the graph dual regularization terms and then solves the non-negative matrix factorization of A with the graph dual regularization terms and L2,1 norm regularization terms to obtain the feature matrices of microbes and diseases. Finally, the feature matrices are used to predict potential microbe-disease associations.
A brief flow chart of the model iPALM-GLMF is shown in Figure 1.

| Non-negative matrix factorization
Non-negative matrix factorization (NMF) is a method for finding two low-rank non-negative matrices whose product approximates the original non-negative matrix well. 40 It incorporates non-negativity constraints to obtain a component-based representation, which enhances the interpretability of the problem.
In microbe-disease association prediction, NMF of the association matrix is widely used to obtain low-dimensional feature representations of microbes and diseases. The general form of NMF is as follows:
$$\min_{X \ge 0,\; Y \ge 0} \lVert A - XY^{T} \rVert_F^{2}$$
where X and Y represent the latent feature matrices of microbes and diseases, respectively, k is the rank of X and Y with $k \ll \min(m, d)$, $X \in \mathbb{R}^{m \times k}$ and $Y \in \mathbb{R}^{d \times k}$. The constraints ensure the non-negativity of X and Y.
FIGURE 1 Flow chart of potential microbe-disease association prediction based on the computational model of iPALM-GLMF.
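To make the basic NMF problem concrete, the following sketch factorizes a small matrix as A ≈ XYᵀ with X and Y non-negative. It uses the classical multiplicative update rules of Lee and Seung, which is a different and simpler solver than the inertial proximal scheme adopted in this paper, and it starts from a random initialization rather than NNDSVD. The function name and toy data are assumptions for illustration only.

```python
import numpy as np

def nmf_multiplicative(A, k, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, A ~ X @ Y.T with X, Y >= 0, via multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, d = A.shape
    X = rng.random((m, k))  # latent microbe features (m x k)
    Y = rng.random((d, k))  # latent disease features (d x k)
    for _ in range(n_iter):
        # Each update keeps the factors non-negative and does not increase the fit error.
        X *= (A @ Y) / (X @ (Y.T @ Y) + eps)
        Y *= (A.T @ X) / (Y @ (X.T @ X) + eps)
    return X, Y

# Toy block-structured association matrix (6 microbes x 4 diseases).
A_toy = np.zeros((6, 4))
A_toy[:3, :2] = 1.0
A_toy[3:, 2:] = 1.0

X, Y = nmf_multiplicative(A_toy, k=2)
scores = X @ Y.T  # reconstructed association scores
```

The product `X @ Y.T` plays the role of the predicted score matrix; entries that are zero in A but receive a high score are the candidate associations.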

| Graph dual regularized non-negative matrix factorization
Cai et al. 41 proposed graph regularized non-negative matrix factorization (GNMF) to find a compact representation that reveals hidden semantics while respecting the intrinsic geometric structure. It has been shown that learning performance can be greatly improved if the information about the manifold structure contained in the data is utilized. 42,43 In addition, Shang et al. 44 introduced graph dual regularization terms based on data manifolds and feature manifolds.
To obtain the geometric information of microbes and diseases, two K-nearest neighbour graphs $N^m$ and $N^d$ are constructed for microbes and diseases based on GM and GD, respectively.
For two microbes $m_i$ and $m_j$, the weight of the edge between vertices i and j in graph $N^m$ is defined as follows.
where $\mathcal{N}_K(i)$ denotes the set of the K most similar microbes of microbe $m_i$ according to GM. Based on $N^m$ and GM, a sparse matrix $\widehat{GM}$ is computed as follows: The optimization model of graph dual regularized non-negative matrix factorization (GDNMF) of the microbe-disease interaction matrix A is formulated as follows: where $\lambda_m$ and $\lambda_d$ are regularization parameters.
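The graph construction can be sketched as follows. Since the exact edge-weight formula is not reproduced here, the sketch assumes a weighting that is common in graph regularized NMF: 1 when two nodes are mutual K-nearest neighbours, 0.5 when only one is a neighbour of the other, and 0 otherwise. The sparsified similarity is then taken as the element-wise product of these weights with the similarity matrix, and the Laplacian is the degree matrix minus the sparsified similarity, with a symmetric normalization. All function names are hypothetical.

```python
import numpy as np

def knn_weights(S, K):
    """Assumed weighting: 1 for mutual K-nearest neighbours, 0.5 for one-sided, 0 otherwise."""
    n = S.shape[0]
    S_off = S.astype(float)            # astype returns a copy
    np.fill_diagonal(S_off, -np.inf)   # a node is never its own neighbour
    nbrs = np.argsort(-S_off, axis=1)[:, :K]
    is_nbr = np.zeros((n, n), dtype=bool)
    is_nbr[np.repeat(np.arange(n), K), nbrs.ravel()] = True  # is_nbr[i, j]: j is in N_K(i)
    return 0.5 * (is_nbr.astype(float) + is_nbr.T.astype(float))

def graph_laplacians(S, K):
    """Sparsified similarity, Laplacian L = D - G_hat, and its symmetric normalization."""
    G_hat = knn_weights(S, K) * S
    deg = G_hat.sum(axis=1)
    L = np.diag(deg) - G_hat
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    L_norm = inv_sqrt[:, None] * L * inv_sqrt[None, :]
    return G_hat, L, L_norm

# Toy symmetric similarity matrix for four microbes (made-up values).
S_toy = np.array([[1.0, 0.9, 0.1, 0.2],
                  [0.9, 1.0, 0.3, 0.1],
                  [0.1, 0.3, 1.0, 0.8],
                  [0.2, 0.1, 0.8, 1.0]])
G_hat, L, L_norm = graph_laplacians(S_toy, K=1)
```

With K = 1 the toy graph keeps only the two strongest pairs, which shows how the K-nearest neighbour step discards weak similarities before the regularization term is built.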

| GDNMF with L 2,1 norm regularization terms
Zhang et al. 45 introduced L2,1 norm regularization terms into graph dual regularized non-negative matrix factorization to ensure the sparsity of the matrices obtained by the factorization.
The optimization model of GDNMF with L2,1 norm regularization terms is formulated as follows: where $\lambda_l$ is a regularization parameter, $\lVert X \rVert_{2,1}$ and $\lVert Y \rVert_{2,1}$ represent the L2,1 norms of the matrices X and Y, respectively, and $\lVert X \rVert_{2,1} = \sum_{i=1}^{m} \sqrt{\sum_{j=1}^{k} X_{ij}^{2}}$.
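A short numerical example may help to see why the L2,1 norm promotes row sparsity. The norm is the sum of the Euclidean norms of the rows, so a row contributes nothing only when it is entirely zero. The helper name `l21_norm` and the toy matrix are illustrative.

```python
import numpy as np

def l21_norm(X):
    """L2,1 norm: sum over rows of the Euclidean norm of each row."""
    return float(np.sum(np.sqrt(np.sum(X ** 2, axis=1))))

X_toy = np.array([[3.0, 4.0],   # row norm 5
                  [0.0, 0.0],   # row norm 0 (an entirely sparse row)
                  [1.0, 0.0]])  # row norm 1
value = l21_norm(X_toy)
print(value)  # 6.0
```

Because the penalty is applied per row, minimizing it tends to drive whole rows of the feature matrix to zero rather than scattered individual entries.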

| Non-negative double singular value decomposition
Non-negative double singular value decomposition (NNDSVD) is a method that improves the initialization phase of non-negative matrix factorization (NMF) by providing valid and interpretable initial component matrices for the factorization. 35 By the basic property of singular value decomposition, the matrix A can be expressed as the sum of its k leading singular factors, $A = \sum_{i=1}^{k} \sigma_i u_i v_i^{T}$, where $\sigma_1 \ge \dots \ge \sigma_k > 0$ are the nonzero singular values of A and $\{u_i, v_i\}_{i=1}^{k}$ are the corresponding left and right singular vectors. For a vector or matrix a, $a^{+} = \max(0, a)$ denotes the non-negative section of a and $a^{-} = \max(0, -a)$ denotes the non-positive section of a, so that $a = a^{+} - a^{-}$. Then $A = \sum_{i=1}^{k} \sigma_i u_i v_i^{T}$ can be transformed to the following form:
$$A = \sum_{i=1}^{k} \sigma_i \left(u_i^{+} v_i^{+T} + u_i^{-} v_i^{-T}\right) - \sum_{i=1}^{k} \sigma_i \left(u_i^{+} v_i^{-T} + u_i^{-} v_i^{+T}\right)$$
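The NNDSVD initialization can be sketched in a few lines. The version below follows the basic scheme of Boutsidis and Gallopoulos: the leading singular pair is taken in absolute value, and for each further pair the dominant of the positive and negative sections is kept and rescaled. It is a minimal sketch written for this discussion, with the hypothetical name `nndsvd`, and it omits variants that fill zeros with small values.

```python
import numpy as np

def nndsvd(A, k):
    """Basic NNDSVD initialization: returns X (m x k), Y (d x k) with A ~ X @ Y.T."""
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    m, d = A.shape
    X = np.zeros((m, k))
    Y = np.zeros((d, k))
    # For a non-negative A the leading singular vectors can be chosen non-negative.
    X[:, 0] = np.sqrt(S[0]) * np.abs(U[:, 0])
    Y[:, 0] = np.sqrt(S[0]) * np.abs(Vt[0])
    for j in range(1, k):
        u, v = U[:, j], Vt[j]
        up, un = np.maximum(u, 0.0), np.maximum(-u, 0.0)  # positive / negative sections
        vp, vn = np.maximum(v, 0.0), np.maximum(-v, 0.0)
        mp = np.linalg.norm(up) * np.linalg.norm(vp)
        mn = np.linalg.norm(un) * np.linalg.norm(vn)
        # Keep whichever pair of sections carries more weight.
        a, b, scale = (up, vp, mp) if mp >= mn else (un, vn, mn)
        if scale == 0.0:
            continue  # leave this component at zero
        a = a / np.linalg.norm(a)
        b = b / np.linalg.norm(b)
        X[:, j] = np.sqrt(S[j] * scale) * a
        Y[:, j] = np.sqrt(S[j] * scale) * b
    return X, Y

A_toy = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0]])
X, Y = nndsvd(A_toy, k=2)
```

Unlike a random start, this initialization is deterministic, so repeated runs of the factorization begin from the same point.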

| Proximal alternating linearized minimization
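Bolte et al. 46 introduced the Proximal Alternating Linearized Minimization (PALM) method, which has global convergence results for nonconvex and nonsmooth semialgebraic problems.
Model (12) can be rewritten as Formula (14). The non-negativity constraints of Formula (14) can be moved into the objective through lower semi-continuous functions that enforce non-negativity, so that model (14) becomes model (17). To solve model (17), the Gauss-Seidel method is used. With Y fixed at $Y^{i}$, we substitute $Y^{i}$ into $G(X, Y)$, remove the constant terms and obtain the subproblem for $X^{i+1}$, in which $G(X, Y^{i})$ is a smooth function. The second-order Taylor series of $G(X, Y^{i})$ at the point $X^{i}$ is then used, where $\nabla_X G$ is the partial derivative of G with respect to X.
Define the proximal map of f: $\operatorname{prox}^{f}_{t}(x) = \arg\min_{u} \left\{ f(u) + \tfrac{t}{2} \lVert u - x \rVert^{2} \right\}$, where f is the lower semi-continuous function that ensures non-negativity, x is a fixed point and t is a constant. According to the definition of the proximal map, the solution of Formula (20) is $X^{i+1} \in \operatorname{prox}^{f}_{t}\left(X^{i} - \tfrac{1}{t} \nabla_X G(X^{i}, Y^{i})\right)$. Then for a sequence $\{(X^{i}, Y^{i})\}_{i \in \mathbb{N}}$ and parameters $c_1^{i}$ and $c_2^{i}$, we have
$$X^{i+1} \in \operatorname{prox}^{f}_{c_1^{i}}\left(X^{i} - \tfrac{1}{c_1^{i}} \nabla_X G(X^{i}, Y^{i})\right), \qquad Y^{i+1} \in \operatorname{prox}^{g}_{c_2^{i}}\left(Y^{i} - \tfrac{1}{c_2^{i}} \nabla_Y G(X^{i+1}, Y^{i})\right)$$
where g is the corresponding function that ensures the non-negativity of Y.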
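For the non-negativity constraint, the proximal map reduces to a projection onto the non-negative orthant, so one PALM step is a gradient step followed by clipping at zero. The sketch below illustrates a single X-update under a simplifying assumption: the smooth part G contains only the Frobenius data-fitting term, not the graph or L2,1 terms of the full model. The step parameter is set slightly above the Lipschitz constant of the partial gradient, and all names are hypothetical.

```python
import numpy as np

def prox_nonneg(V):
    """Proximal map of the non-negativity indicator: projection onto V >= 0."""
    return np.maximum(V, 0.0)

def palm_step_X(A, X, Y, c):
    """One PALM update of X for the simplified smooth part ||A - X Y^T||_F^2."""
    grad = 2.0 * (X @ Y.T - A) @ Y  # partial gradient with respect to X
    return prox_nonneg(X - grad / c)

rng = np.random.default_rng(0)
A_toy = (rng.random((5, 4)) < 0.4).astype(float)  # made-up binary matrix
X0 = rng.random((5, 2))
Y0 = rng.random((4, 2))

# Lipschitz constant of the partial gradient in X is 2 * ||Y^T Y||_2;
# the factor 1.1 keeps the step parameter strictly above it.
c = 1.1 * 2.0 * np.linalg.norm(Y0.T @ Y0, 2)
X1 = palm_step_X(A_toy, X0, Y0, c)
```

Choosing the step parameter above the Lipschitz constant is what guarantees that this projected gradient step does not increase the data-fitting error.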

| Inertial terms
Polyak showed that an inertial term accelerates the convergence of the standard gradient method while the cost of each iteration remains essentially unchanged. 47 A class of proximal methods for maximal monotone operators was considered by Attouch 48 in the context of second-order differential equations in time.
These methods are called inertial proximal methods. PALM relies on a first-order gradient descent scheme, so an inertial term is added to speed up its convergence.

| Inertial proximal alternating linearized minimization
Let G denote the objective function of model (12). Then model (12) can be expressed as follows: The partial derivatives of the function G with respect to X and Y, respectively, are as follows:
Pseudocode for the iPALM-GLMF algorithm.
For sequences

we can get
The detailed steps of iPALM-GLMF are illustrated in Figure 2.
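The overall iteration can be sketched in code under clearly stated assumptions, since the exact formulas are not reproduced here. The sketch assumes the objective G(X, Y) = ‖A − XYᵀ‖² + λ_m Tr(XᵀL_mX) + λ_d Tr(YᵀL_dY) + λ_l(‖X‖₂,₁ + ‖Y‖₂,₁), smooths the L2,1 norm with a small constant `eps` so that it is differentiable, uses a single inertial coefficient `alpha` for both the extrapolation and the gradient point, and starts from a random initialization instead of NNDSVD. The symbols, default values and function names are illustrative and are not the published implementation.

```python
import numpy as np

def objective(A, X, Y, Lm, Ld, lam_m=0.1, lam_d=0.1, lam_l=0.01, eps=1e-4):
    """Assumed objective: fit + graph dual regularization + smoothed L2,1 terms."""
    fit = np.linalg.norm(A - X @ Y.T) ** 2
    graph = lam_m * np.trace(X.T @ Lm @ X) + lam_d * np.trace(Y.T @ Ld @ Y)
    sparse = lam_l * (np.sqrt((X ** 2).sum(axis=1) + eps).sum()
                      + np.sqrt((Y ** 2).sum(axis=1) + eps).sum())
    return fit + graph + sparse

def grad_block(R, F, other, Lap, lam_g, lam_l, eps):
    """Gradient in F of ||R - F other^T||^2 + lam_g tr(F^T Lap F) + lam_l * smoothed L2,1(F)."""
    row_scale = 1.0 / np.sqrt((F ** 2).sum(axis=1, keepdims=True) + eps)
    return 2.0 * (F @ other.T - R) @ other + 2.0 * lam_g * Lap @ F + lam_l * row_scale * F

def step_const(other, Lap, lam_g, lam_l, eps):
    """Slightly more than an upper bound on the Lipschitz constant of grad_block."""
    return 1.1 * (2.0 * np.linalg.norm(other.T @ other, 2)
                  + 2.0 * lam_g * np.linalg.norm(Lap, 2) + lam_l / np.sqrt(eps))

def ipalm(A, Lm, Ld, k, lam_m=0.1, lam_d=0.1, lam_l=0.01,
          alpha=0.2, n_iter=100, eps=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.random((A.shape[0], k))
    Y = rng.random((A.shape[1], k))
    X_prev, Y_prev = X.copy(), Y.copy()
    for _ in range(n_iter):
        Zx = X + alpha * (X - X_prev)  # inertial extrapolation for X
        c1 = step_const(Y, Lm, lam_m, lam_l, eps)
        X_prev, X = X, np.maximum(Zx - grad_block(A, Zx, Y, Lm, lam_m, lam_l, eps) / c1, 0.0)
        Zy = Y + alpha * (Y - Y_prev)  # inertial extrapolation for Y
        c2 = step_const(X, Ld, lam_d, lam_l, eps)
        Y_prev, Y = Y, np.maximum(Zy - grad_block(A.T, Zy, X, Ld, lam_d, lam_l, eps) / c2, 0.0)
    return X, Y

def complete_laplacian(n):
    """Laplacian of a complete graph, used here only as a toy stand-in."""
    W = np.ones((n, n)) - np.eye(n)
    return np.diag(W.sum(axis=1)) - W

A_toy = np.zeros((8, 6))
A_toy[:4, :3] = 1.0
A_toy[4:, 3:] = 1.0
Lm, Ld = complete_laplacian(8), complete_laplacian(6)
X, Y = ipalm(A_toy, Lm, Ld, k=3)
scores = X @ Y.T  # predicted association scores
```

Each block update is an extrapolation, a gradient step and a projection, so the inertial term adds only one extra matrix addition per block and per iteration.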
The parameter values used in our model are set based on a previous study. 49
| EXPERIMENTS

| Evaluation metrics
To evaluate the performance of iPALM-GLMF, we performed two

| Performance comparison
Based on the cross-validation results, we used AUPR and AUC values as metrics to evaluate the performance of iPALM-GLMF. We compared iPALM-GLMF with the following state-of-the-art methods on the same datasets.
KATZHMDA 15 is based on the KATZ measure and predicts potential disease-microbe associations by calculating the number and length of paths between two nodes in the microbe-disease heterogeneous network.
BiRWHMDA 52 is a method for predicting potential microbe-disease associations by double random walks on heterogeneous networks.
ABHMDA 29 is a model for revealing disease-associated microbes through a strong classifier composed of weak classifiers with corresponding weights.
Our method was compared with five baselines under fivefold cross-validation and global LOOCV on two datasets, namely HMDAD and Disbiome. We used AUC and AUPR values as indicators to evaluate each method. For better visual comparison, the corresponding ROC curves for iPALM-GLMF, KATZHMDA, LRLSHMDA, NTSHMDA, BiRWHMDA and ABHMDA are shown in Figures 3 and 4.
On the HMDAD database, iPALM-GLMF performed best compared to the other five baseline methods, with average AUCs of 0.9464 ± 0.0039 and 0.9587 under fivefold cross-validation and global LOOCV, respectively. On the Disbiome database, iPALM-GLMF also performed best compared to the other five baseline methods, with average AUCs of 0.8660 ± 0.0015 and 0.8850 under fivefold cross-validation and global LOOCV, respectively. The results indicate that our method is effective in predicting novel microbe-disease associations.
To further evaluate the validity of our model, the AUPR values under fivefold cross-validation for the two databases are shown in Figure 5. The average AUPRs of our method on the HMDAD and Disbiome databases were 0.8476 and 0.4515, respectively, which were better than those of the baseline methods. The performance of iPALM-GLMF on HMDAD and Disbiome under fivefold cross-validation is summarized in Table 2. Our method achieved higher AUC and AUPR values on HMDAD than on Disbiome. The main reason may be that Disbiome is sparser than HMDAD: the density of HMDAD is 3.95%, whereas the density of Disbiome is 1.90%. Therefore, iPALM-GLMF was better trained on HMDAD than on Disbiome.
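The two density figures can be checked directly from the dataset statistics reported earlier (450 associations among 292 microbes and 39 diseases for HMDAD; 4351 associations among 1052 microbes and 218 diseases for Disbiome). Density is taken here to mean the fraction of non-zero entries in the association matrix, which is an assumption about the definition used; the helper name is illustrative.

```python
def density(n_associations, n_microbes, n_diseases):
    """Fraction of known associations among all microbe-disease pairs."""
    return n_associations / (n_microbes * n_diseases)

hmdad = density(450, 292, 39)
disbiome = density(4351, 1052, 218)
print(f"HMDAD: {hmdad:.2%}, Disbiome: {disbiome:.2%}")  # HMDAD: 3.95%, Disbiome: 1.90%
```

Both values agree with the percentages quoted in the text, which supports reading the reported statistics as the sizes of the deduplicated datasets.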

| Ablation experiment
In this section, we sought to determine the impact of several techniques on the performance of the proposed iPALM-GLMF. To this end, we evaluated iPALM-GLMF, iPALM-GLMF without NNDSVD (i.e. SVD was used in the initialization phase of the matrix factorization), iPALM-GLMF with $\lambda_m = 0$ (i.e. the graph regularization term for microbes was not used), iPALM-GLMF with $\lambda_d = 0$ (i.e. the graph regularization term for diseases was not used), iPALM-GLMF with $\lambda_l = 0$ (i.e. the L2,1 norm regularization term was not used) and PALM-GRMF (i.e. the inertial term was not used). The results of these settings are shown in Tables 3 and 4. In fivefold cross-validation, the full iPALM-GLMF performed better than all of the other settings.

| Case studies
To further evaluate whether iPALM-GLMF could demonstrate accurate and robust performance, we conducted two types of case studies covering three diseases: colorectal cancer, inflammatory bowel disease (IBD) and asthma. These studies were conducted using the HMDAD database.
In the first case study, we performed potential microbe prediction for colorectal cancer and inflammatory bowel disease (IBD).
Specifically, we sorted all unknown samples for the same disease and checked whether the associations between the top 10 microbes and the disease under study were validated by relevant literature. Colorectal cancer is the second leading cause of cancer deaths in the United States, and the incidence in young adults is increasing each year. 53

TABLE 4 AUPR values of different algorithms under fivefold cross-validation.
The main types of inflammatory bowel disease (IBD), which include Crohn's disease and ulcerative colitis, are caused in part by bacteria that may activate the patient's immune system to attack foreign bodies. 56 Once activated, the patient's immune system is difficult to regulate and damages the gastrointestinal tract, leading to IBD symptoms. Recent research indicates a close correlation between various microorganisms and IBD. For instance, a reduction in members of the phyla Bacteroidetes and Firmicutes has been observed in IBD, particularly across different variants. 57 As shown in Table 6, we implemented iPALM-GLMF to discover potentially relevant microbes for IBD and found that all of the top 10 predictions were confirmed by relevant literature. The high prediction accuracy suggests that our model could be used in real-life applications.
In the second case study, we performed relevant microbe prediction for asthma with the goal of evaluating the model's ability to predict associations between unknown microbes and a disease in the absence of any known relevant microbes. Specifically, we set all entries of the adjacency matrix that associate microbes with the disease under study to zero. After model prediction, we checked how many of the top 20 ranked microbes were confirmed in the relevant literature, as shown in Figure 6. Of these, 2 microbes are already recorded in HMDAD and a further 16 have been confirmed in the literature, leaving only 2 that have not yet been validated. In other words, 18 of the top 20 microbes predicted by our model have been confirmed, further demonstrating the validity of iPALM-GLMF.
TABLE 6 The top 10 potential microbes related to inflammatory bowel disease identified by iPALM-GLMF.
FIGURE 6 Prediction results of top-20 asthma-associated microbes.
constructed a new model to reveal potential microbe-disease associations by integrating two independent recommendation models: a neighbour-based prediction model and a graph-based prediction model. Wu et al. 21 presented a novel computational model employing Random Walk with Restart optimized by Particle Swarm Optimization (PSO) on the heterogeneous interlinked network of Human Microbe-Disease Associations (PRWHMDA). The model runs a random walk with restart on a heterogeneous network of human microbe-disease associations, using PSO to optimize the random walk parameters and obtain the final association probability vector. Yan et al.
developed a model of Adaptive Boosting for Human Microbe-Disease Associations prediction (ABHMDA), which reveals microbes associated with a disease through a strong classifier consisting of weak classifiers with their own weights. ABHMDA assigns different weights to multiple weak classifiers to obtain the final association. Wang et al. developed a semi-supervised computational model of Laplacian Regularized Least Squares for Human Microbe-Disease Associations (LRLSHMDA) with good results. Li et al. 30 proposed a novel computational method called BPNNHMDA. The method uses a neural network model with a unique activation function and optimized initial connection weights based on Gaussian interaction profile kernel similarity, which effectively improves the training speed of the model. Hua et al. 31 developed a model of Multi-View Graph
$\gamma_m$ and $\gamma_d$ represent the normalized kernel bandwidths and are defined as follows:
$$\gamma_m = \gamma'_m \Big/ \left(\frac{1}{m}\sum_{i=1}^{m} \lVert IP(m_i) \rVert^{2}\right), \qquad \gamma_d = \gamma'_d \Big/ \left(\frac{1}{d}\sum_{j=1}^{d} \lVert IP(d_j) \rVert^{2}\right)$$
where $\gamma'_m$ and $\gamma'_d$ are the original bandwidths, and generally both are set to 1.
$\widehat{GM}$ is a weight matrix representing the microbe neighbour graph. The graph Laplacian of $\widehat{GM}$ is $L_m = D^{m} - \widehat{GM}$, where $D^{m}$ is a diagonal degree matrix with $D^{m}_{ii} = \sum_{r} \widehat{GM}_{ir}$. Similarly, the weight matrix $\widehat{GD}$ corresponding to the disease neighbour graph is computed from $N^{d}$ and GD. The graph Laplacian of $\widehat{GD}$ is $L_d = D^{d} - \widehat{GD}$, where $D^{d}$ is a diagonal degree matrix with $D^{d}_{jj} = \sum_{q} \widehat{GD}_{jq}$. The normalized graph Laplacian forms of $L_m$ and $L_d$ are as follows:
$$\tilde{L}_m = (D^{m})^{-1/2} L_m (D^{m})^{-1/2}, \qquad \tilde{L}_d = (D^{d})^{-1/2} L_d (D^{d})^{-1/2}$$
types of cross-validations, namely global LOOCV and fivefold cross-validation, in the datasets HMDAD and Disbiome. In global LOOCV, each recorded microbe-disease association was selected in turn as the test sample, the model was trained with the remaining known associations, and the test sample was then ranked together with all unknown microbe-disease pairs. In fivefold cross-validation, the known microbe-disease associations were randomly divided into five equal parts; each part was used in turn as the test set, and the remaining four parts were used as training samples. As with global LOOCV, all unknown microbe-disease pairs were treated as candidate samples. To mitigate the possible impact of sample partitioning on the prediction results, we randomly partitioned the known microbe-disease associations 100 times. In both cross-validations, we implemented iPALM-GLMF to obtain a list of scores for all microbe-disease pairs and ranked the score of each test sample against the scores of the candidate samples. If the test sample ranked above a given threshold, we considered that the model had successfully predicted the association. Notably, we recalculated the microbe (disease) Gaussian interaction profile kernel similarity during each round of LOOCV and fivefold cross-validation, because the adjacency matrix changed when one or more of the known microbe-disease associations were removed.
LRLSHMDA is a semi-supervised learning model based on Laplacian regularized least squares classification. NTSHMDA 51 is a random-walk-based predictive model that predicts human microbe-disease associations by integrating network topological similarities.

FIGURE 4 The graphs show the AUCs of iPALM-GLMF in fivefold cross-validation (0.8660) and global LOOCV (0.8850) under the Disbiome database, respectively, which outperformed all the aforementioned models (KATZHMDA, LRLSHMDA, NTSHMDA, BiRWHMDA and ABHMDA).
FIGURE 5 Comparison of AUPR values for iPALM-GLMF and five other methods using fivefold cross-validation.

| CONCLUSION
Recognizing potential microbe-disease associations contributes not only to disease diagnosis, treatment and prognosis, but also to microbe-oriented therapies in precision medicine. In this article, we proposed a novel matrix factorization-based model called iPALM-GLMF for inferring potential microbe-disease associations. In the model, we combined graph dual regularization terms with L2,1 norm regularization terms, which were used to capture information about the geometric structure in the microbe similarity matrix and the disease similarity matrix and to ensure the sparsity of the matrices obtained from the non-negative matrix factorization. We then solved the matrix factorization with graph dual regularization terms and L2,1 norm regularization terms using an inertial proximal alternating linearized minimization algorithm.
Therefore, there is an urgent need for novel and sensitive biomarkers that can detect colorectal cancer in an effective and timely manner. Researchers have linked many microbes to colon cancer. For example, D. A. Geier and M. Geier 54 unearthed a potential link between Clostridium difficile intestinal infection and colon cancer incidence, finding that adults with Clostridium difficile intestinal infection had a significantly increased incidence of colon cancer. Another example is the discovery by Ralser et al. 55 that Helicobacter pylori promotes colorectal carcinogenesis by deregulating intestinal immunity and inducing a mucus-degrading microbiota signature; the evidence they provide suggests that H. pylori infection is a strong causal facilitator of colorectal carcinogenesis. As shown in Table 5, we implemented iPALM-GLMF to discover potentially relevant microbes for colorectal cancer and found that 9 of the top 10 predictions were confirmed by relevant literature.
TABLE 2 Performance of the iPALM-GLMF method under fivefold cross-validation on two datasets. Note: The maximum AUPR on each dataset is shown in bold. Standard deviation is shown in parentheses.
TABLE 5 The top 10 potential microbes related to colorectal cancer identified by iPALM-GLMF.